Clustering Symbolic Time-Series using L-tuples

Authors

  • John K. Nguyen
  • Hancheng Jiang
Abstract

Among the many dimensionality reduction methods for time-series data, Symbolic Aggregate approXimation (SAX) is perhaps the most popular due to its simplicity and uniqueness. With SAX, time-series data can be represented as string sequences, which enables the use of methods from text mining and bioinformatics to enhance data mining tasks. We propose an application of L-tuples to improve the clustering of SAX-represented time series. Using the L-tuple frequency distributions of sequences, we compute dissimilarity based on the maximum Kullback-Leibler divergence. We compare our new approach and dissimilarity measure with existing SAX measures and show that our dissimilarity measure with L-tuples enhances the quality of time-series clustering.
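The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact definitions, so the code assumes that an L-tuple is a length-L substring of the SAX word, that "maximum Kullback-Leibler divergence" means the larger of the two directed divergences, and that a pseudocount is added to avoid zero probabilities. All function names (`sax`, `ltuple_distribution`, `max_kl_dissimilarity`) and parameters are hypothetical.

```python
import math
from collections import Counter
from itertools import product
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series to a SAX word (z-normalize, PAA, discretize).

    Assumes n_segments <= len(series) so that no segment is empty.
    """
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation: mean of each near-equal-width segment.
    n = len(z)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]
    # Breakpoints split the standard normal into equiprobable regions.
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[sum(v > c for c in cuts)] for v in paa)


def ltuple_distribution(word, L, alphabet, pseudocount=1.0):
    """Relative frequency of every possible L-tuple over the alphabet.

    The pseudocount is an assumption of this sketch; it keeps every
    probability positive so the KL divergence stays finite.
    """
    counts = Counter(word[i:i + L] for i in range(len(word) - L + 1))
    tuples = ["".join(t) for t in product(alphabet, repeat=L)]
    total = sum(counts.values()) + pseudocount * len(tuples)
    return {t: (counts[t] + pseudocount) / total for t in tuples}


def kl(p, q):
    """Directed Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p)


def max_kl_dissimilarity(word_a, word_b, L, alphabet, pseudocount=1.0):
    """Assumed symmetrization: the larger of KL(p || q) and KL(q || p)."""
    p = ltuple_distribution(word_a, L, alphabet, pseudocount)
    q = ltuple_distribution(word_b, L, alphabet, pseudocount)
    return max(kl(p, q), kl(q, p))


if __name__ == "__main__":
    alphabet = "abcd"
    rising = sax([float(i) for i in range(32)], 8, alphabet)
    wavy = sax([math.sin(i / 2.0) for i in range(32)], 8, alphabet)
    print(rising, wavy, max_kl_dissimilarity(rising, wavy, 2, alphabet))
```

Computing this dissimilarity for every pair of SAX words gives a symmetric matrix with a zero diagonal, which is the form that distance-based clustering methods such as hierarchical clustering expect.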

Similar resources

A new approach to detect Life threatening cardiac arrhythmias using Sequential spectrum of Electrocardiogram signals

This study evaluates the discriminative power of sequential spectrum analysis of the short-term electrocardiogram (ECG) time series in separating normal subjects from subjects with life-threatening arrhythmias such as ventricular tachycardia/fibrillation (VT/VF). The raw ECG time series is transformed into a series of binary symbols and the binary occupancy or relative distribution of mono-sequences (i.e. ...

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information-gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large datasets has always been one of the most important challenges in different sciences,...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

Fuzzy clustering of time series data: A particle swarm optimization approach

With the rapid development of information-gathering technologies and access to large amounts of data, we need methods for analyzing data and extracting useful information from large raw datasets, and data mining is an important method for solving this problem. Clustering analysis, as the most commonly used function of data mining, has attracted many researchers in computer science. Because o...

Publication date: 2016